dnhatn
Member

@dnhatn dnhatn commented Aug 28, 2025

Currently, each driver uses a local breaker that over-reserves memory so that it calls the global circuit breaker, which can be expensive, less often. After reviewing some profiles, the maximum reserved amount appears too small and does not significantly reduce calls to the global breaker.
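
To illustrate the effect described above, here is a minimal, self-contained sketch of a per-driver breaker that keeps spare headroom from a shared breaker. It is not the Elasticsearch implementation: `GlobalBreaker`, `LocalBreaker`, `globalCalls`, and the byte values are hypothetical names and numbers chosen for illustration only. It counts how many calls reach the global breaker when a driver repeatedly allocates and frees one 256KB page.

```java
import java.util.concurrent.atomic.AtomicLong;

public class LocalBreakerSketch {

    /** Hypothetical stand-in for the shared global breaker; counts how often it is called. */
    static final class GlobalBreaker {
        private final long limitBytes;
        private final AtomicLong usedBytes = new AtomicLong();
        private final AtomicLong calls = new AtomicLong();

        GlobalBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        void reserve(long bytes) {
            calls.incrementAndGet();
            if (usedBytes.addAndGet(bytes) > limitBytes) {
                usedBytes.addAndGet(-bytes); // roll back before failing
                throw new IllegalStateException("global breaker tripped");
            }
        }

        void release(long bytes) {
            calls.incrementAndGet();
            usedBytes.addAndGet(-bytes);
        }

        long calls() {
            return calls.get();
        }
    }

    /** Hypothetical per-driver breaker; assumed to be used by a single thread. */
    static final class LocalBreaker {
        private final GlobalBreaker global;
        private final long maxReservedBytes; // cap on spare bytes held locally
        private long spareBytes;             // reserved globally but not in use

        LocalBreaker(GlobalBreaker global, long maxReservedBytes) {
            this.global = global;
            this.maxReservedBytes = maxReservedBytes;
        }

        void reserve(long bytes) {
            if (bytes <= spareBytes) {
                spareBytes -= bytes; // served locally, no global call
                return;
            }
            // Ask the global breaker for the shortfall plus fresh headroom.
            global.reserve(bytes - spareBytes + maxReservedBytes);
            spareBytes = maxReservedBytes;
        }

        void release(long bytes) {
            spareBytes += bytes;
            if (spareBytes > maxReservedBytes) {
                // Hand anything above the cap back to the global breaker.
                global.release(spareBytes - maxReservedBytes);
                spareBytes = maxReservedBytes;
            }
        }

        void close() {
            if (spareBytes > 0) {
                global.release(spareBytes);
                spareBytes = 0;
            }
        }
    }

    /** Simulates one driver allocating and freeing a page repeatedly; returns the global call count. */
    public static long globalCalls(long maxReservedBytes, long pageBytes, int iterations) {
        GlobalBreaker global = new GlobalBreaker(1L << 30);
        LocalBreaker local = new LocalBreaker(global, maxReservedBytes);
        for (int i = 0; i < iterations; i++) {
            local.reserve(pageBytes);
            local.release(pageBytes);
        }
        local.close();
        return global.calls();
    }

    public static void main(String[] args) {
        long kb = 1024;
        System.out.println("no headroom:    " + globalCalls(0, 256 * kb, 1000));
        System.out.println("64KB headroom:  " + globalCalls(64 * kb, 256 * kb, 1000));
        System.out.println("512KB headroom: " + globalCalls(512 * kb, 256 * kb, 1000));
    }
}
```

In this toy model, a cap smaller than one page saves almost nothing, because every page-sized request still falls through to the global breaker, while a cap of at least one page absorbs the steady-state traffic. The real breaker's accounting may differ.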

The reserved amount should cover at least several pages of data (~256KB each by default). This change proposes increasing the maximum reserved amount to two pages (512KB). One downside is that we may hit the circuit breaker earlier under extremely tight memory conditions, but the impact should be minimal. For example, on a node with 8 CPUs, the total reserved could be 12 * 0.5MB = 6MB. In cases with many long-running queries that frequently wake and sleep, the extra over-reserved memory may be more noticeable. However, even with 100 drivers, the total maximum would be only 100 * 0.5MB = 50MB.
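
The worst-case figures above are the per-driver cap multiplied by the number of drivers. A small sketch of that arithmetic, where `ReservationMath` and `worstCaseReservedBytes` are hypothetical names and the driver counts are the ones quoted in the description:

```java
public class ReservationMath {
    static final long KB = 1024;
    static final long MB = 1024 * KB;

    /** Upper bound on spare memory held if every driver sits at its cap. */
    public static long worstCaseReservedBytes(int drivers, long maxReservedPerDriverBytes) {
        return Math.multiplyExact((long) drivers, maxReservedPerDriverBytes);
    }

    public static void main(String[] args) {
        long cap = 512 * KB; // two ~256KB pages, as proposed in this change
        System.out.println(worstCaseReservedBytes(12, cap) / MB + "MB");
        System.out.println(worstCaseReservedBytes(100, cap) / MB + "MB");
    }
}
```

This is an upper bound only: it assumes every driver holds its full cap at the same moment.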

@dnhatn dnhatn requested a review from nik9000 August 28, 2025 22:43
@dnhatn dnhatn marked this pull request as ready for review August 28, 2025 22:43
@elasticsearchmachine elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Aug 28, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@dnhatn dnhatn requested a review from martijnvg August 29, 2025 02:45
Member

@martijnvg martijnvg left a comment

I've observed this as well in flame graphs, LGTM 👍

@dnhatn dnhatn merged commit a977c95 into elastic:main Aug 29, 2025
33 checks passed
@dnhatn dnhatn deleted the pump-local-breaker branch August 29, 2025 05:46
@dnhatn
Member Author

dnhatn commented Aug 29, 2025

Thanks Martijn!

JeremyDahlgren pushed a commit to JeremyDahlgren/elasticsearch that referenced this pull request Aug 29, 2025
Labels

:Analytics/ES|QL (AKA ESQL), >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v9.2.0

3 participants